Abstract

Cameras for imaging in the short- and mid-wave infrared spectra are
significantly more expensive than their counterparts in visible imaging.
As a result, high-resolution imaging in those spectra remains beyond the
reach of most consumers. Over the last decade, compressive sensing (CS)
has emerged as a potential means to realize inexpensive short-wave
infrared cameras. One approach is the single-pixel camera (SPC), where a
single detector acquires coded measurements of a high-resolution image.
A computational reconstruction algorithm is then used to recover the
image from these coded measurements. Unfortunately, the measurement rate
of an SPC is insufficient to enable imaging at high spatial and temporal
resolutions.

We present a focal plane array-based compressive sensing (FPA-CS)
architecture that achieves high spatial and temporal resolutions. The
idea is to use an array of SPCs that sense in parallel to increase the
measurement rate and, consequently, the achievable spatio-temporal
resolution of the camera. We develop a proof-of-concept prototype in the
short-wave infrared using a sensor with 64 × 64 pixels; the prototype
provides a 4096× increase in the measurement rate compared to the SPC
and achieves megapixel resolution at video rate using CS techniques.


1. Introduction 

The cost of high-resolution sensors in the visible spectrum has fallen
dramatically over the last decade. For example, a cellphone camera
module boasting a sensor with several megapixels costs little more than
a few dollars. This trend is fueled by the fact that silicon is
sensitive to the visible region of the electromagnetic spectrum; hence,
the scaling trends and advances made in silicon-based semiconductor
fabrication directly benefit visible imaging technologies.
Unfortunately, these scaling trends do not extend to imaging beyond the
visible spectrum.

Motivation. In many application domains, imaging beyond the visible
spectrum provides significant benefits over traditional visible sensors.
For example, short-wave infrared (SWIR) penetrates fog and smog; this
enables imaging through scattering media. The night-glow of the sky
naturally provides SWIR illumination, which enables SWIR sensors to
image passively even in the dark. SWIR imaging also enables a wide
variety of biomedical applications [30]. Yet, SWIR imaging requires
sensors made of exotic materials such as indium gallium arsenide
(InGaAs), which are orders of magnitude more expensive than silicon. As
a consequence, the cost of a megapixel sensor in the SWIR regime still
exceeds tens of thousands of dollars. Hence, despite their immense
potential, high-resolution SWIR cameras are beyond the reach of
engineers and scientists in the application domains that could benefit
most from their use.

Figure 1. A focal plane array-based compressive sensing (FPA-CS) camera
with a 64 × 64 SWIR sensor array is equivalent to 4096 single-pixel
cameras (SPCs) operating in parallel. This results in vastly superior
spatio-temporal resolution compared to what is achievable using the SPC
or a traditional camera.

In this paper, we leverage the theory and practice of compressive
sensing (CS) to enable high-resolution SWIR imaging from low-resolution
sensor arrays. CS relies on the ability to obtain arbitrary linear
measurements of the scene; this requires a fundamental redesign of the
architecture used to image the scene. The single-pixel camera (SPC) is
an example of such an architecture. The SPC uses a digital micro-mirror
device (DMD) as a spatial light modulator and acquires coded
measurements of an image onto a single photo-detector. We can build a
SWIR SPC by employing a photo-detector sensitive to SWIR along with the
appropriate choice of optical accessories. The programmable nature of
the DMD enables the photo-detector to obtain the sum of any arbitrary
subset of pixels. The high-resolution image can then be computationally
reconstructed from a small number of such measurements.

An SPC is incapable of producing high-resolution images at video rate.
To understand why, we first observe that the measurement rate of an SPC
is determined by the operating speed of its DMD, which seldom goes
beyond 20 kHz. At this measurement rate, conventional Nyquist-based
sampling with a single pixel can barely support a 20 fps video at a
spatial resolution of 32 × 32 pixels. To sense at megapixel resolution
and video rate using a CS-based SPC, we would need a measurement rate of
tens of millions of measurements per second, a gap of 1000× that is
beyond the capabilities of current CS techniques.
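The rate argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original work: the function names are hypothetical, and the 30 fps target and the 1000 × 1000 pixel count are assumed stand-ins for "video rate" and "megapixel".

```python
def nyquist_frame_rate(measurement_rate_hz, width, height):
    """Frames per second when every pixel needs one measurement per frame."""
    return measurement_rate_hz / (width * height)


def required_measurement_rate(width, height, fps, compression=1.0):
    """Measurements per second needed for a video at a given compression factor."""
    return width * height * fps / compression


spc_rate_hz = 20_000  # DMD-limited SPC measurement rate quoted in the text

# A 32 x 32 image at 20 kHz: close to the 20 fps quoted in the text.
print(nyquist_frame_rate(spc_rate_hz, 32, 32))

# Nyquist sampling of an assumed 1000 x 1000 video at an assumed 30 fps.
print(required_measurement_rate(1000, 1000, 30))
```

Passing a `compression` value greater than 1 shows how far a given compression factor closes the gap between the required rate and the 20 kHz limit.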

Applications of SWIR Imaging. A large number of applications that are
difficult or impossible to perform using the visible spectrum become
much simpler due to the characteristics of the SWIR spectrum. SWIR
imaging is currently used in a host of applications, including
automotive, electronic board inspection, solar cell inspection, produce
inspection, identification and sorting, surveillance,
anti-counterfeiting, process quality control, and much more. Some unique
properties of SWIR that enable many of these applications include (a)
improved penetration through scattering media including tissue, fog, and
haze, (b) seeing through many commonly used packaging materials, which
are transparent in SWIR while opaque in the visible spectrum, and (c)
observing defects and gauging the quality of fruits and vegetables.
Finally, for night-time surveillance applications, the night-glow of the
sky provides sufficient SWIR illumination even on a moonless night; this
enables long-distance SWIR imaging without the need for extra
illumination sources that could compromise reconnaissance.

Contributions. This paper enables a novel class of CS architectures that
achieve high spatial and temporal resolutions using inexpensive
low-resolution sensors. The main technical contributions of this paper
are:

• We characterize the spatio-temporal resolution limits of CS
architectures. A key finding is that a space-bandwidth product mismatch
between the DMD and the photo-detector results in sub-optimal
performance.

• We propose the focal plane array-based compressive sensing (FPA-CS)
camera, an imaging architecture that is optically identical to thousands
of SPCs acquiring compressive measurements in parallel (see Figure 1).
FPA-CS balances space-bandwidth product constraints, thereby enabling
CS-based imaging architectures with higher spatial and temporal
resolutions.

• We develop a prototype FPA-CS camera in SWIR and demonstrate capturing
1 megapixel images at video rate, far exceeding the capabilities of
current methods.


2. Related work 

Compressive sensing (CS). Compressive sensing deals with the estimation
of a signal $x \in \mathbb{R}^N$ from $M < N$ linear measurements
$y \in \mathbb{R}^M$ of the form

$$ y = Ax + e, \tag{1} $$

where $e$ is the measurement noise and $A$ is the measurement matrix.
Estimating the signal $x$ from the compressive measurements $y$ is an
ill-posed problem since the system of equations is under-determined.
Nevertheless, a fundamental result from CS theory states that a robust
estimate of the vector $x$ can be obtained from $M \sim K \log(N/K)$
measurements if the signal $x$ admits a $K$-sparse representation and
the sensing matrix $A$ satisfies the so-called restricted isometry
property. Furthermore, signals with sparse transform-domain coefficients
or sparse gradients can be estimated stably from the noisy measurement
$y$ by solving a convex problem.
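To make the recovery claim concrete, here is a small sketch of sparse recovery from under-determined measurements. It uses iterative soft thresholding (ISTA) on an l1-regularized least-squares problem, which is one standard convex formulation and not necessarily the solver used in this paper; the Gaussian matrix, problem sizes, and regularization weight are illustrative assumptions.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def ista(A, y, lam=0.01, n_iter=3000):
    """Minimise 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + step * (A.T @ (y - A @ x)), step * lam)
    return x


def demo(n=128, m=64, k=5, seed=0):
    """Recover a k-sparse signal of length n from m < n noiseless measurements."""
    rng = np.random.default_rng(seed)
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.choice([-1.0, 1.0], size=k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)  # assumed random sensing matrix
    y = A @ x_true
    x_hat = ista(A, y)
    rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
    return x_true, x_hat, rel_err


if __name__ == "__main__":
    print("relative error:", demo()[2])
```

The signal here is sparse in the canonical basis; for images, the same idea is applied to transform coefficients or gradients.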

Compressive imaging architectures. In the context of video CS, there are
two broad classes of architectures: spatial multiplexing and temporal
multiplexing cameras. Figure 2 provides a comparison of various CS
architectures.

Spatial multiplexing cameras (SMCs) acquire coded, low-resolution images
and super-resolve them to obtain high-resolution images. In particular,
they employ a spatial light modulator (SLM), e.g., a digital
micro-mirror device (DMD) or liquid crystal on silicon (LCoS), to
optically compute linear projections of the scene $x$; these linear
projections determine the rows of the sensing matrix $A$ in (1). Since
SMCs are usually built with only low-resolution sensors, they can
operate at wavelengths where full-frame sensors are too expensive.

A prominent example of an SMC is the single-pixel camera (SPC); its main
features are that it senses using only a single sensor element (i.e., a
single pixel) and that the number of multiplexed measurements required
for image reconstruction is significantly smaller than the number of
pixels in the scene. In the SPC, light from the scene is focused onto a
programmable DMD, which directs light from only a subset of activated
micro-mirrors onto the photodetector. By changing the micro-mirror
configurations, we can obtain linear measurements corresponding to the
sensing model in (1). Several multi-pixel extensions to the SPC have
been proposed recently, with the goal of increasing the measurement
rate. To our knowledge, ours is the only design that focuses on sensing
in SWIR wavebands.

A commercial version of the SPC for sensing in SWIR has been produced by
InView Corporation. A key difference between our proposed architecture
and the InView camera is the number of sensing elements. To our
knowledge, the InView camera, much like the SPC, uses a single
photodiode, whereas we use a sensor with 64 × 64 pixels.


SMCs for video CS also make use of a diverse set of signal models and
constraints, including 3D wavelets, multi-scale wavelet lifting, optical
flow-based reconstructions, block-based models, sparse frame-to-frame
residuals, linear dynamical systems, and combinations of low-rank and
sparse matrices [37]. One characteristic of all these algorithms is that
reconstruction performance improves with an increasing number of
measurements. However, the measurement rate in traditional SPC
architectures is too low to support high-resolution, high-frame-rate
reconstructions.

In sharp contrast to SMCs, temporal multiplexing cameras (TMCs) use
full-frame sensors with low frame rates and aim to super-resolve videos
temporally, i.e., produce high-frame-rate videos from low-frame-rate
sensors. Veeraraghavan et al. [33] showed that periodic scenes could be
imaged at high temporal resolutions by using global temporal coding.
This idea was later extended to non-periodic scenes using a
union-of-subspaces model. Per-pixel temporal modulation to recover
higher frame rates was demonstrated using prototypes that used LCoS for
modulation. Llull et al. propose a TMC that uses a translating mask in
the sensor plane to achieve temporal multiplexing. Harmany et al. extend
coded aperture systems by incorporating a flutter shutter; the resulting
TMC provides immense flexibility in the choice of measurement matrix. A
common feature of all TMCs is the use of a high-resolution sensor; this
makes them inapplicable for SWIR imaging, where high-resolution sensor
arrays are prohibitively expensive.

Super-resolution (SR). SR is a technique that is commonly used to
enhance the resolution of a given image. Traditional SR works by
utilizing image priors or acquiring multiple measurements of the same
scene. However, traditional SR cannot arbitrarily increase resolution,
and even state-of-the-art SR algorithms are limited to 2-4× upsampling.
In contrast, for a loss in temporal resolution, our proposed
architecture can achieve the full spatial resolution of the DMD without
compression (see Section 6 and Figures 4(f) and 5(b)) by scanning the
DMD pixel-wise with the sensor, which can be seen as implementing SR in
the optical domain: at each sensor pixel, we take 256 non-overlapping
sub-pixel measurements by turning on different DMD pixels that map to
the given sensor pixel. If we were to procure a DMD with higher
resolution, the camera system would inherit that same resolution.

3. Spatio-temporal resolution (STR) 

Nyquist cameras STR. The STR of a camera is limited by the product of
the number of pixels and the maximum frame rate. For example, a 1
megapixel sensor operating at 30 fps provides a measurement rate, which
we denote as $M_r$, of $30 \times 10^6$ samples per second. Traditional
cameras rely on the principle of Nyquist sampling; thus, for such
cameras $\mathrm{STR} = M_r$. The pixel count and frame rate of visible
sensors have significantly improved, and it is now common to obtain
sensors that can achieve megapixel resolution at 30 fps. Unfortunately,
sensors outside the visible spectrum either provide a much lower spatial
resolution or are quite expensive.

Figure 2. Comparisons of some existing CS-based cameras.

Compressive cameras STR. Let us consider a CS-based camera operating at
a measurement rate of $M_r$ samples per second that can provide a
high-resolution video with $\alpha M_r$ pixels per second. The effective
STR can be written as $\mathrm{STR} = \alpha M_r$, where $\alpha > 1$
represents the compression factor by which the sampling rate is reduced
thanks to the compressive sensing framework. In an SPC, the measurement
rate is typically limited by the maximum rate at which synchronization
can be achieved between the DMD modulator and the sensor. While a
photo-detector can be operated at very high rates (even GHz),
commercially available DMDs seldom operate faster than 10-20 kHz. Hence,
it is not possible to achieve synchronization between any of the current
high-resolution spatial light modulators and a photo-detector at greater
than $f_{DMD} = 20$ kHz. This directly imposes a limit on the STR of
compressive cameras based on single-pixel sensors, i.e.,
$\mathrm{STR} \le \alpha M_r = \alpha f_{DMD}$ samples per second.

Increasing the measurement rate. From the previous discussion, it is
clear that in order to increase the STR of CS-based imaging systems, one
must increase the measurement rate. Given that the operating speed of
the DMD poses strict limits on the number of frames we can obtain in
unit time, one approach is to increase the measurement rate by reading
multiple measurements in parallel. As an example, a compressive imaging
system in which a $K \times K$ pixel image sensor array is used to
acquire multiplexed measurements in synchronization with a DMD at an
operational rate of $f_{DMD}$ Hz provides a measurement rate of
$M_r = K^2 f_{DMD}$ samples per second, a $K^2$ times improvement over
the SPC. This increased measurement rate enables the acquisition of
videos at higher spatial and temporal resolution. In the next section,
we describe a SWIR prototype that uses a 64 × 64 focal plane array
sensor along with a DMD operating at $f_{DMD} = 480$ Hz to achieve
measurement rates in millions of samples per second.

4. Specifics of the FPA-CS prototype 

System Architecture. Figure 3 shows a schematic of our design and a
photograph of our prototype. We utilized a Texas Instruments
LightCrafter DMD as the light modulator. The DMD consists of 1140 × 912
micro-mirrors, each of size 7.6 microns. Each mirror can be
independently rotated to either +12° or −12° around the optical axis at
a rate of 2.88 kHz. We used a SWIR objective lens (Edmund Optics
#83-165) with a focal length of 50 mm to focus the scene onto the DMD. A
150 mm-150 mm relay lens pair (2× Thorlabs AC254-150C) was placed after
the SWIR objective lens to extend the original flange distance, thereby
providing ample space for the light bundle reflecting out of the DMD. We
also used a 50 mm field lens (Thorlabs LB1417) to reduce vignetting. The
light incident on the DMD corresponding to pixels that are oriented at
−12° is discarded, while the light that is reflected from pixels that
are oriented at +12° is focused on the SWIR sensor using a re-imaging
lens. We used a 100 mm-45 mm lens pair (Thorlabs AC-254-100C and
AC-254-045C) as our re-imaging lens, which provides a 1:2.22
magnification, as the physical sizes of the sensor and the DMD are
different.

We used a 64 × 64 SWIR sensor (Hamamatsu G11097-0606S) with a 50 micron
pixel size. The relay lens is configured such that roughly 16 × 16
pixels on the DMD map to one sensor pixel. Since the DMD has a
rectangular shape and the sensor has a square one, some of the DMD
pixels are not mapped onto the sensor. In our prototype, the sensor
measures a square region of the DMD with approximately 600,000
micromirrors, which we up-sample to a megapixel image. Furthermore,
since the DMD and the sensor planes are not parallel, we adjusted the
relay lens and the sensor position to satisfy the Scheimpflug principle
so that the entire scene plane remains in focus on the sensor [21].

Achievable measurement rates. In our prototype, we operate the DMD at
$f_{DMD} = 480$ Hz in synchronization with the readout timing of the
64 × 64 sensor array. Therefore, we obtain a measurement rate of
$M_r \approx 2 \times 10^6$ pixels/sec. In our experimental results, we
demonstrate high-fidelity video reconstruction at a compression rate of
$\alpha = 16$. Thus, the effective STR of our system is limited by
$\mathrm{STR} \le \alpha K^2 f_{DMD} = 32 \times 10^6$ pixels/sec, which
makes recovery of videos at a spatial resolution of 1 megapixel at
approximately 32 frames/sec possible. In comparison, an SPC with the DMD
operating at 20,000 Hz and a compression factor of $\alpha = 16$ would
provide 1 megapixel at 0.32 fps. Clearly, the mismatch between the
measurement rate of the sensor and the operating speed of the DMD is the
major bottleneck in the SPC.
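The numbers in this comparison follow from the bound STR ≤ α K² f_DMD. The sketch below (function names are hypothetical) evaluates it without rounding; the exact products come out slightly below the rounded figures quoted in the text, since 4096 × 480 is about 1.97 × 10^6 rather than 2 × 10^6.

```python
def measurement_rate(sensor_side, f_dmd_hz):
    """M_r = K^2 * f_DMD for a K x K sensor read out once per DMD pattern."""
    return sensor_side ** 2 * f_dmd_hz


def str_bound(alpha, sensor_side, f_dmd_hz):
    """Upper bound on reconstructed pixels per second: alpha * K^2 * f_DMD."""
    return alpha * measurement_rate(sensor_side, f_dmd_hz)


MEGAPIXEL = 1_000_000  # assumed definition of "1 megapixel"

fpa_cs = str_bound(alpha=16, sensor_side=64, f_dmd_hz=480)
spc = str_bound(alpha=16, sensor_side=1, f_dmd_hz=20_000)

print("FPA-CS:", fpa_cs, "pixels/s,", fpa_cs / MEGAPIXEL, "fps at 1 megapixel")
print("SPC:   ", spc, "pixels/s,", spc / MEGAPIXEL, "fps at 1 megapixel")
```

Treating the SPC as the K = 1 case of the same formula makes the K² scaling of the measurement rate explicit.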
Figure 3. (a) Schematic illustration of the optical system. (b)
Photograph of our prototype.


It is worth noting here that an equivalent full-frame SWIR sensor that
can operate at 1 megapixel spatial resolution and 30 fps would cost
upwards of $60,000. In contrast, our 64 × 64-pixel SWIR sensor costs
approximately $2,000 and the DMD with its evaluation board costs
approximately $1,500, resulting in a total cost of under $4,000 for our
FPA-CS prototype.

5. Reconstruction algorithms 

Forward imaging model of FPA-CS. Our FPA-CS prototype is equivalent to
an array of 64 × 64 SPCs working in parallel. The DMD and sensor are
synchronized to project modulation patterns and record coded
low-resolution images at a certain frame rate, say $f_{DMD}$ fps; we
used $f_{DMD} = 480$ fps in our experiments.

Let us describe the sensor image measured at time $t$ using the
following matrix-vector notation: $y_t = A_t x_t$, where $y_t$ is a
vector of sensor measurements at the 4096 pixels, $x_t$ represents the
unknown high-resolution image formed at the DMD plane, and the matrix
$A_t$ encodes modulation of $x_t$ with the DMD pattern and the
subsequent mapping onto the SWIR sensor pixels. Thus, $A_t$ can be
decomposed into two components as $A_t = C D_t$: $D_t$ denotes a
diagonal matrix that contains the binary pattern projected on the DMD at
time $t$, and every entry $C_{i,j}$ of $C$ represents the response of a
unit light from the $j$th DMD mirror at the $i$th sensor pixel. $C$ is a
highly sparse matrix, because only a small group of roughly 16 × 16 DMD
mirrors maps to a single pixel on the SWIR sensor. For a stationary
camera assembly, we can therefore estimate $C$ using a separate,
one-time calibration procedure, whose result is used in all subsequent
experiments.
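As an illustration of the decomposition A_t = C D_t, the sketch below builds an idealized C in which each sensor pixel simply sums one square block of mirrors with unit response. This is an assumption made for clarity: the calibrated C of the prototype is measured, not constructed, and the sizes used here are toy values. Dense matrices are used only because the example is small.

```python
import numpy as np


def block_sum_matrix(sensor_side, block):
    """Idealised C: each sensor pixel sums one block x block group of mirrors."""
    P = np.kron(np.eye(sensor_side), np.ones((1, block)))  # shape (S, S * block)
    return np.kron(P, P)  # acts on row-major flattened DMD-plane images


def forward(C, mask, x):
    """Return (y_t, A_t) for one DMD pattern, with A_t = C @ D_t."""
    D_t = np.diag(mask.ravel().astype(float))  # binary pattern on the diagonal
    A_t = C @ D_t
    return A_t @ x.ravel(), A_t


S, B = 4, 4  # toy sizes; the prototype has S = 64 and roughly B = 16
rng = np.random.default_rng(0)
x = rng.random((S * B, S * B))           # unknown image at the DMD plane
mask = rng.integers(0, 2, size=x.shape)  # random binary DMD pattern
C = block_sum_matrix(S, B)
y_t, A_t = forward(C, mask, x)
print(A_t.shape, y_t.shape)
```

At prototype scale, C and D_t would be stored as sparse matrices, since each row of C has only about 256 non-zero entries.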

To reconstruct video at a desired frame rate, say $F_r$ fps, we divide
the low-resolution sensor images, $y_t$, into sets of
$T = f_{DMD}/F_r$ measurements and assume that all measurements in a set
correspond to the same high-resolution image. Suppose the $k$th set
corresponds to $y_t = A_t x_t$ for $t = (k-1)T + 1, \ldots, kT$; we
assume that $x_t = \mathbf{x}_k$ and stack all the $y_t$ and $A_t$ in
the $k$th set to form a larger system of equations. For instance, the
system for $k = 1$ can be written as

$$ \mathbf{y}_1 = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_T \end{bmatrix} \mathbf{x}_1 = \mathbf{A}_1 \mathbf{x}_1. \tag{2} $$

Our goal is to reconstruct $\mathbf{x}_k$ from the noisy and possibly
under-determined set of linear equations
$\mathbf{y}_k = \mathbf{A}_k \mathbf{x}_k$.
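The grouping and stacking step can be sketched as follows. The helper names and the random stand-in for the calibrated matrix C are hypothetical; the point is only that T per-pattern systems sharing one image become a single taller system.

```python
import numpy as np


def frames_per_set(f_dmd_hz, video_fps):
    """T = f_DMD / F_r: number of coded sensor images grouped into one video frame."""
    return int(round(f_dmd_hz / video_fps))


def stack_system(A_list, y_list):
    """Stack per-pattern systems y_t = A_t x into one system y_k = A_k x."""
    return np.vstack(A_list), np.concatenate(y_list)


def toy_set(T=6, n_sensor=4, n_dmd=16, seed=0):
    """Hypothetical toy data: T masked systems that all observe the same image x."""
    rng = np.random.default_rng(seed)
    x = rng.random(n_dmd)
    C = rng.random((n_sensor, n_dmd))  # stand-in for the calibrated matrix C
    # Scaling column j of C by mask[j] is the same as C @ diag(mask).
    A_list = [C * rng.integers(0, 2, n_dmd) for _ in range(T)]
    y_list = [A @ x for A in A_list]
    return A_list, y_list, x


A_list, y_list, x = toy_set()
A_k, y_k = stack_system(A_list, y_list)
print(A_k.shape, y_k.shape, frames_per_set(480, 30))
```

The stacked pair `(A_k, y_k)` is what a reconstruction algorithm would receive for the kth video frame.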


Total variation-based reconstruction. Natural images have been shown to
have sparse gradients. For such signals, one can solve an optimization
problem of the following form [6, 22]:

$$ \hat{x} = \arg\min_x \mathrm{TV}(x) \quad \text{subject to} \quad \|y - Ax\|_2 \le \epsilon, \tag{3} $$

where the term $\mathrm{TV}(x)$ refers to the total variation of $x$. In
the context of images, where $x$ denotes a 2D signal, the operator
$\mathrm{TV}(x)$ can be defined as

$$ \mathrm{TV}(x) = \sum_i \sqrt{(D_u x(i))^2 + (D_v x(i))^2}, $$

where $D_u x$ and $D_v x$ are the spatial gradients along the horizontal
and vertical dimensions of $x$, respectively. In the context of video
signals, we can also exploit the similarity between adjacent frames
along the temporal direction. We can view a video signal as a 3D object
that consists of a sequence of 2D images, and we expect the signal to
exhibit sparse gradients along the spatial and temporal directions.
Thus, we can easily extend the definition of the TV operator to include
gradients in the temporal direction:

$$ \mathrm{TV}_{3D}(x) = \sum_i \sqrt{(D_u x(i))^2 + (D_v x(i))^2 + (D_t x(i))^2}, $$

where $D_t x$ represents the gradient along the temporal dimension of
$x$. In our experimental results, we used TVAL3 for the reconstruction
of images and MFISTA for the reconstruction of videos.
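The two regularizers can be evaluated directly with finite differences. The sketch below computes the isotropic 2D and 3D total variation of an array; the forward-difference scheme with a replicated last sample (so the boundary gradient is zero) is an assumed discretization, since the text does not specify one. It only evaluates the TV terms and is not a reconstruction solver such as TVAL3 or MFISTA.

```python
import numpy as np


def tv2d(x):
    """Isotropic total variation of a 2D image of shape (H, W)."""
    d_u = np.diff(x, axis=1, append=x[:, -1:])  # horizontal forward difference
    d_v = np.diff(x, axis=0, append=x[-1:, :])  # vertical forward difference
    return np.sqrt(d_u ** 2 + d_v ** 2).sum()


def tv3d(video):
    """Isotropic spatio-temporal total variation of a video of shape (T, H, W)."""
    d_t = np.diff(video, axis=0, append=video[-1:])
    d_v = np.diff(video, axis=1, append=video[:, -1:, :])
    d_u = np.diff(video, axis=2, append=video[:, :, -1:])
    return np.sqrt(d_u ** 2 + d_v ** 2 + d_t ** 2).sum()


# A vertical step edge: the only non-zero gradients lie along the edge.
step = np.zeros((4, 4))
step[:, 2:] = 1.0
static_video = np.stack([step] * 3)  # three identical frames, no temporal change

print(tv2d(step), tv3d(static_video))
```

For a static scene the temporal term vanishes and the 3D value reduces to the sum of the per-frame 2D values, which is why the 3D prior favors videos with little frame-to-frame change.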


6. Experiments 

To demonstrate the performance of our device, we show 
results on several static and dynamic scenes captured using 
our prototype SWIR FPA-CS camera. 

Resolution chart. To study the spatial resolution characteristics of our
system, we first captured images of a USAF 1951 target using our
prototype device. For each measurement, the DMD projected a random
binary spatial pattern and the sensor recorded a 64 × 64 image. We set
the exposure duration of each acquired image to 0.8 ms and acquired 512
sensor images with varying patterns on the DMD. We then reconstructed
the scene at the same resolution as that of the DMD using the
TV-regularized reconstruction algorithm described in Section 5. To study
the impact of the number of measurements, we reconstructed the image
with T = 64, 128, 256, and 512 captured images, which correspond to
compression factors of $\alpha$ = 4, 2, 1, and 0.5, respectively. We
defined $\alpha \approx 10^6/(T \times 4096)$. Figure 4 presents the
results obtained at various compression rates.
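The quoted compression factors can be reproduced with a one-line formula. In the sketch below, taking the reconstructed image to have 1024 × 1024 pixels is an assumption; it yields the round values 4, 2, 1, and 0.5 exactly, whereas using 10^6 pixels gives roughly 3.8, 1.9, 0.95, and 0.48.

```python
def compression_factor(n_pixels, t_images, sensor_pixels=4096):
    """alpha = reconstructed pixels / total sensor measurements (T * 4096)."""
    return n_pixels / (t_images * sensor_pixels)


N_ASSUMED = 1024 * 1024  # assumed reconstruction size, close to 10^6

for t in (64, 128, 256, 512):
    print(t, compression_factor(N_ASSUMED, t), round(compression_factor(10**6, t), 2))
```

A value of alpha below 1 means that more measurements than unknowns were collected, so the system is over-determined rather than compressive.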

Figure 4(a) presents an up-sampled version of a single 64 × 64 image
using bicubic interpolation. Figures 4(b)-(e) present images
reconstructed with different numbers of measurements; the spatial
quality of the reconstructed images improves as the number of
measurements increases. Figure 4(f) presents an image reconstructed
using direct pixel-wise scanning of the DMD without multiplexing. This
allows us to acquire images at the maximum spatial resolution that
FPA-CS can provide (limited only by the performance of the optics), at
the sacrifice of temporal resolution. To perform the direct pixel-wise
scanning, we divided the DMD into 18 × 20 regions that map to
non-overlapping sensor pixels. We sequentially turned on one micromirror
in all the groups and recorded the respective sensor images. In this
manner, we can compute the image intensity at every micromirror
location. Such pixel-wise scanning, with non-overlapping division of the
DMD, requires approximately 3000 images to be captured. Such scanning
can also be considered a super-resolution scheme performed in the
optical domain, but it can only be used for static scenes.
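The scanning procedure can be simulated on a toy scale. The sketch below assumes an idealized geometry in which every sensor pixel sees exactly one square block of mirrors with unit response and no crosstalk; the block size and image size are illustrative, not the prototype's.

```python
import numpy as np


def scan_masks(sensor_side, block):
    """Yield (dr, dc, mask) with exactly one mirror switched on in every block."""
    n = sensor_side * block
    for dr in range(block):
        for dc in range(block):
            mask = np.zeros((n, n))
            mask[dr::block, dc::block] = 1.0
            yield dr, dc, mask


def sensor_image(x, mask, sensor_side, block):
    """Idealised sensor: each pixel sums the light from its own block of mirrors."""
    masked = x * mask
    return masked.reshape(sensor_side, block, sensor_side, block).sum(axis=(1, 3))


def pixelwise_scan(x, sensor_side, block):
    """Rebuild the DMD-resolution image from block * block sequential captures."""
    recon = np.zeros_like(x)
    for dr, dc, mask in scan_masks(sensor_side, block):
        recon[dr::block, dc::block] = sensor_image(x, mask, sensor_side, block)
    return recon


rng = np.random.default_rng(0)
scene = rng.random((4 * 3, 4 * 3))  # static toy scene: 4 x 4 sensor, 3 x 3 blocks
recon = pixelwise_scan(scene, sensor_side=4, block=3)
print(np.allclose(recon, scene))
```

The number of captures equals the number of mirrors per block, which is why this mode trades temporal resolution for spatial resolution and suits only static scenes.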

Overall, the results in Figure 4 demonstrate that high-resolution images
can be obtained from a small number of multiplexed images. Furthermore,
FPA-CS provides a flexible tradeoff between the spatial and temporal
resolutions of the reconstructed signals. As we increase the number of
images used for reconstruction, the spatial quality improves, but the
imaging interval per frame also increases. Therefore, a small number of
multiplexed images can be used to reconstruct dynamic scenes at high
temporal resolution, or static scenes can be reconstructed using a large
number of multiplexed images.

Figure 4. FPA-CS results for a resolution chart. (a) Interpolation of a
64 × 64 sensor image. (b)-(e) Images reconstructed using an increasing
number of measurements (T); compression factor
$\alpha \approx 10^6/(T \times 4096)$; a larger value of T would
translate to a smaller frame rate of reconstruction for a dynamic scene.
(f) Direct pixel-wise scanning result, where no multiplexing is
performed and $T \approx 3000$.

Highlighting capabilities of imaging in SWIR. We present a simple
experiment that highlights two attributes of SWIR imaging. The scene in
Figure 5, when observed in the visible spectrum, is largely unremarkable
and consists of an opaque bottle and a crayon scribble in the
background. However, the corresponding SWIR images show two interesting
differences. First, note that the crayon scribble is transparent in
SWIR, therefore allowing us to read the text behind it. Second, the
bottle is partially transparent in SWIR, therefore allowing us to see
through the bottle and determine the water level inside it. Figure 5
also shows 1D plots of the mean intensity of columns inside the
highlighted color boxes; the SWIR intensity changes because of the water
inside the bottle, but the visible light intensity remains unchanged.

Video reconstruction. Figure 6 shows results on two dynamic scenes. In
the moving car and moving hand videos, we grouped the captured sequence
into sets of T = 16 and T = 22 images, respectively. We used one such
set to represent each frame of the video according to (2); these
correspond to videos at 32 fps and 21.8 fps, with compression factors
$\alpha = 16$ and $\alpha = 11.6$, respectively. We used the 3D TV
optimization problem described in Section 5 for the reconstruction.
Complete videos can be found in the supplemental material.

7. Discussions 

Artifacts. Some artifacts can be observed in the output images and
videos from the FPA-CS system. The two major types are (1) motion
artifacts that occur at sharp edges of moving objects in the scene, and
(2) "blocky" structural artifacts that underlie the entire captured
scene. The motion artifacts are simply a result of motion blur, while
the "blocky" artifacts are caused by small misalignments in the mapping
between the sensor pixels and the DMD pixels introduced after the system
calibration. In practice, we observed that applying a 3D median filter
can largely suppress both artifacts.

Choice of modulation masks. To obtain high-fidelity reconstruction,
modulation masks should meet two conditions. First, the system matrix in
(2), i.e., the combined system of 4096 SPCs in every set of T frames,
should be well-conditioned so that the image reconstruction process is
stable and robust to noise. Second, the spatial code should have high
light throughput to maximize the signal-to-noise ratio in the sensor
images. In this paper, we tested two mask patterns, Hadamard and random
binary, both of which satisfy these conditions. Hadamard matrices are
known to be optimal in terms of linear multiplexing [29]. The results
shown in Figure [?] correspond to Hadamard measurements. We also used
random binary patterns since they are known to satisfy the restricted
isometry property, and therefore lead to robust inversion when used
along with sparse regularizers such as total variation. The results
shown in Figures 4 and [?] correspond to binary random measurements.
Notice that in both cases, 50% of the light reaches the sensor after
modulation. In practice, we observed that the reconstructions obtained
from the two modulation patterns were near-identical.
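Both mask families are easy to generate and check for light throughput. The sketch below uses the Sylvester construction for the Hadamard matrix and maps its ±1 entries to on/off mirror states; how the patterns were actually laid out on the DMD is not stated in the text, so the mapping and the sizes here are assumptions.

```python
import numpy as np


def sylvester_hadamard(n):
    """Hadamard matrix with +1/-1 entries; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.ones((1, 1), dtype=int)
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H


def hadamard_masks(n):
    """Binary on/off masks: +1 maps to mirror on (1), -1 maps to mirror off (0)."""
    return (sylvester_hadamard(n) + 1) // 2


def random_masks(t, n, seed=0):
    """t random binary masks of length n, each mirror on with probability 1/2."""
    return np.random.default_rng(seed).integers(0, 2, size=(t, n))


n = 256  # e.g. one 16 x 16 group of mirrors behind a single sensor pixel
print("Hadamard throughput (rows after the first):", hadamard_masks(n)[1:].mean())
print("Random binary throughput:", random_masks(64, n).mean())
```

Every Hadamard row except the all-ones first row switches on exactly half of the mirrors, while the random masks do so only on average.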

Benefits. FPA-CS provides three advantages over conventional imaging.
First, our CS-inspired FPA-CS system provides an inexpensive alternative
to achieve high-resolution SWIR imaging. Second, compared to traditional
single pixel-based CS cameras, FPA-CS simultaneously records data from
4096 parallel, compressive systems, thereby significantly improving the
measurement rate. As a consequence, the achieved spatio-temporal
resolution of our device is orders of magnitude better than that of the
SPC.

[Figure 5 panels: (a) bicubic up-sampling of the 64 × 64 sensor image;
(b) direct pixel-wise scanning; (c) reconstruction from 128 images; (d)
reconstruction from 256 images; (e) visible DSLR image of the scene;
right: mean intensity of columns in the highlighted box.]

Figure 5. SWIR imaging of an opaque bottle in the foreground and a
crayon scribble in the background. (a) Bicubic interpolated version of a
64 × 64 sensor image. (b) Direct pixel-wise scanning result. (c) and (d)
Images reconstructed with different compression factors $\alpha$. (e)
Visible image of the scene taken with a DSLR camera. Notice that the
crayons are transparent in SWIR, allowing us to read the text behind the
scribbles. The bottle in the foreground is opaque in the visible
spectrum but transparent in SWIR. Highlighted boxes on the right display
the mean intensity of columns in the corresponding regions of the SWIR
and visible images, showing that one can estimate the water level inside
the bottle from the SWIR image, but not from the visible image.

[Figure 6 panels: row 1: scene setup, frames 1, 37, 74, and 110, XT
slice, YT slice; row 2: scene setup, frames 1, 27, 54, and 80, XT slice,
YT slice.]

Figure 6. Selected frames from reconstructed SWIR videos. Each frame in
the moving car video is reconstructed using T = 16 captured images
($\alpha = 16$) at 32 fps. Each frame in the moving hand video is
reconstructed using T = 22 captured images ($\alpha = 11.6$) at 21.8
fps. Both videos are reconstructed using a 3D-TV prior. XT and YT slices
for both videos are shown to the right of the images.

Limitations. FPA-CS exploits spatio-temporal redundancy in the
reconstruction; therefore, extremely complex scenes such as a bursting
balloon cannot be handled directly by the camera. Since the
spatio-temporal redundancies exploited by traditional compression
algorithms and by our imaging architecture are very similar, one can
assume that scenes that can be compressed efficiently can also be
processed well using our method. Our prototype uses a binary per-pixel
shutter, which causes a 50% reduction in light throughput, as half of
the light is wasted. In the future, a separate, synchronized 64 × 64
image sensor could be used in the other arm, thereby doubling the
measurement rate and further increasing the spatio-temporal resolution
that can be achieved. The algorithm is currently not real-time and thus
precludes a direct-view capability.

Conclusion. We presented focal plane array-based compressive sensing
(FPA-CS), a new imaging architecture for parallel compressive
measurement acquisition that can provide quality videos at high spatial
and temporal resolutions in SWIR. The architecture proposed here is
generic and can be adapted to other spectral regimes such as mid-wave
infrared and thermal imaging, where, much like SWIR, sensors are
prohibitively expensive.